Hurst's Rescaled Range Statistical Analysis for Pseudorandom Number 
Generators used in Physical Simulations 
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The rescaled range statistical analysis {RS) is proposed as a new method to detect correlations 
in pseudorandom number generators used in Monte Carlo simulations. In an extensive test it is 
demonstrated that the RS analysis provides a very sensitive method to reveal hidden long run and 
short run correlations. Several widely used and also some recently proposed pseudorandom number 
generators are subjected to this test. In many generators correlations are detected and quantified. 
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I. INTRODUCTION 



Random numbers are the essential ingredient of all 
stochastic simulations. A great many algorithms in 
Monte-Carlo (MC) simulations and other non-physical 
computational fields rely crucially on the statistical prop- 
erties of the random numbers used. High precision calcu- 
lations on nowadays computer hardware typically involve 
the generation of billions of random numbers. 

Today the most convenient and most reliable method 
of obtaining random numbers in practice is the use of 
a deterministic algorithm. Such a numerical method 
produces a sequence of pseudorandom numbers (PRN) 
which mimic the statistical properties of true random 
numbers as good as possible. Usually the pseudoran- 
dom number generator (PRNG) is assumed to generate a 
sequence of independent and identically distributed con- 
tinuous U(0, 1) random number, that means uniformly 
distributed over the interval (0,1). Other distributions 
can be obtained by transformation methods Since 
the state space of the generator is finite the sequence 
of PRNs will be eventually periodic. Therefore the ex- 
pected properties of "true" random variables can only be 
approximated. 

True random numbers can only be produced by phys- 
ical devices that generate events which are principally 
unpredictable in advance, such as noise diodes or gamma 
ray counters. But such devices are inconvenient to use 
and Marsaglia reported that several commercial products 
fail standard statistical tests spectacularly |||,^. An al- 
ternative could be the archiving of random numbers of 
high quality on a CDROM ||], although such a source is 
by far not as convenient to handle as a simple function 
call. 

While theoretical test methods |^,|^ , such as the anal- 
ysis of the lattice structure |^ of linear congruential gen- 
erators, are certainly the starting point for constructing 
a good PRNG there is also a strong need for so-called 
empirical tests. These view the PRNG under consider- 
ation as a black box and statistically analyze sequences 
of numbers for various types of correlations, regardless of 
the generation method. There is a large battery of stan- 



dard tests |-|,0,|§ which every candidate to be used 
in "serious" simulations has to pass. PRNGs that have 
succeeded in all of these tests seemed to work reliable 
in apparently all physical simulations until the last few 
years. But the rapid development of computer hardware 
and improved simulation algorithms have caused the de- 
mands on the quality of the random number sequences 
to greatly increase. As a consequence erroneous results 
have been found in recent high precision MC calcula- 
tions. The errors could be related to the use of popular 
PRNGs in combination with some specialized algorithms 
P which revealed hitherto undetected correlations in 
the pseudorandom sequences. 

Thus there is a strong need to enlarge the tool box 
of empirical tests to gain confidence in newly proposed 
PRNGs [p^"p7[ and to check whether traditionally used 
PRNGs are still reliable in modern applications. Any 
good statistical test should have an idiosyncracy for un- 
wanted correlations and detect defects before they show 
up in an application. Newly developed and highly spe- 
cialized algorithms may be sensitive to structural defects 
in PRNGs which are not evident in the standard tests. As 
different tests detect different types of defects it is desir- 
able to develop application specific tests |p^|-pT[| that are 
especially sensitive to the features of the random num- 
bers which are probed in simulations in current fields of 
research. But often this cannot be assessed in advance 
and the only way to reassure oneself of the correctness of 
a suspicious (or very important) result is to perform an 
in situ test and to repeat the simulation with some differ- 
ent PRNGs. Enlarging the set of test methods therefore 
can help to save precious time and to avoid painful re- 
calculations. 

In section |l| a new test method is proposed which is 
applied to a set of several popular generators described 
in section In section [V the results of the numeri- 
cal experiments are discussed illustrating the capability 
of the new test. Following the conclusions, section ^ 
additional results are tabulated in the appendices. 
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II. THE RS ANALYSIS 

In the following I describe a new technique for judging 
the quality of PRNGs in at least several physically rele- 
vant situations. It will be demonstrated that the rescaled 
range statistical analysis {RS analysis) provides an ex- 
tremely sensitive method for revealing hidden correla- 
tions in PRNGs. 

As this method is based on general statistical prop- 
erties expected for an independent Gaussian process it 
should also be useful as a general tool to test the suitabil- 
ity of a PRNG in a wide class of stochastic simulations. 
In the sequel it will be shown that it is especially effective 
for testing the presence of long run statistical dependence 
and in cases where such a correlation is present, for es- 
timating its intensity. In addition it is shown that also 
short run cyclic components in a pseudorandom sequence 
are easily made evident using the RS statistic. 

Hydrology is the oldest discipline in which noncyclic 
long run dependence has been reported. In particular 
the RS analysis has been invented by Hurst |^,^ when 
he was studying the Nile in order to describe the long 
term dependence of the water level in rivers and reser- 
voirs. Later his method has gained much attention in the 
context of fractional Brownian motion . 

The RS statistic for a series £,t in the discrete integer 
valued time is conventionally defined as follows: 



(1) 



R{s) — maxX(i,s)— min X{t^s) 

l<t<s l<t<s 



Sis) 



t=i 



RS{s) = R{s)/Sis) 

Viewing the £,t as spatial increments in a one- 
dimensional random walk then X]t=i is the distance 
of the walker from the starting point at time s. In the 
quantity X{t, s) the mean 
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(2) 



over the time lag s — 1 is subtracted to remove a trend if 
the expectation value of is not zero. In the sequel the 
difference between the final time s and the initial time 1 
of the stochastic process will be termed the lag r = s — 1. 
R{t) is the self-adjusted range of the cumulative sums 
and RS{t) is the self-rescaled self-adjusted range, which 
is the quantity of our interest. 

Feller has proved that the asymptotic behavior for 
the expectation value of any independent random process 
with finite variance is given by 



The combination R{t)/S{t) has a better sampling sta- 
bility than R(t), in the sense that the relative deviation 
of RS, defined as ARSir) = y/YaiiRS {t)] / E[i?S'(r)], 
is smaller For an independent Gaussian process the 
limiting standard deviation is 



lim Var[i?5(T)] ^7r2/6-7r/2 w 0.2723. 



(4) 



On the other hand Hurst had found empirically that 
many time series of natural phenomena are described by 
the scaling relation 



RS{t) oc t". 



(5) 



lim E[t-5 RS{t)] = 



(3) 



where H differs significantly from 1/2. In the context 
of fractional Brownian motion p^ , p6 | a Hurst exponent 
oi H = 1/2 corresponds to the vanishing of correlations 
between past and future spatial increments in the record. 
For H > 1/2 one has persistent behavior, that means 
a positive increment for some time in the past will on 
the average lead to a positive increment in the future 
(if the increments are distributed symmetrically around 
zero). Correspondingly the case H < 1/2 denotes 
antipersistent behavior. 

Thus almost all long run correlations in the stochastic 
process should show up in deviations from the asymptotic 

Furthermore, Mandelbrot and Wallis have de mon- 
strated that the value of the asymptotic prefactor 
is not robust with respect to short run statistical depen- 
dence [^ . This value can be arbitrarily modified by 
cyclic components in the random process. The superpo- 
sition of a white noise (with zero mean and unit variance) 
and a purely periodic proce ss, for instance, leads to an 
asymptotic value of •\/r7r/2 (1 -t- A/2)~^/2, with A being 
the amplitude of a sine wave. Moreover, the transient to 
the asymptotic is not smooth, but typically shows a series 
of oscillations, resembling the case of a purely oscillatory 
process ]2^ . 

Therefore the RS statistic is perfectly suited to analyze 
a stochastic process for correlations on all scales. 

In the following section several types of PRNGs will 
be used to generate C/(0, l)-distributed random numbers 
^t. The sequence of will then be analyzed according 
to the RS statistic. It will be demonstrated that various 
PRNGs produce sequences of numbers that show devia- 
tions from the asymptotic behavior (^), (^. Moreover, 
it is found that for finite lags r the value of RS(j) differs 
significantly between the tested PRNGs being indicative 
of short range correlations. This way it is possible to ob- 
tain a complete "fingerprint" of correlations of a PRNG 
and to measure their intensity as a function of the lag. 



III. RANDOM GENERATORS 

Because of the vast number of different PRNGs cur- 
rently employed in simulations only a small fraction can 
be selected in this work. 



2 



The generators of the first group, labeled Gl to G7, 
are included as they are in general use - either because 
of traditions, because they are recommended in popular 
books, or because they can be found in many commercial 
software packages. Some of them have documented de- 
fects (G1,G2,G3,G5). These are considered here to study 
how their deviations show up in the RS statistics. The 
generators in the second group, G8 to Gil, have been 
proposed recently to match also future requirements on 
period length and quality. But there is little documented 
experience about their behavior in physical simulations. 
As there are many good reviews and books on the various 
generation methods and the performance in the standard 
tests [p|-^,|^,p7|-p9t only a brief outline of the considered 
algorithms is given in the next section. 



A. Generation Methods 

Most of the commonly used PRNGs are based 
on the linear congruential method. In general a 
multiple recursive generator of order k, denoted by 
MRG(ai, . . . ,ak]c;m), is based on the fcth-order linear 
recurrence 



(aix„_i H h OkXn-k + c) mod™, 



(6) 



where the order k and the modulus m are positive 
integers and the coefficients are integers in the range 
{ — (m — l),...,m — 1}. The numbers x„ of the sequence 
are then scaled to the interval (0, 1) by u„ = a;„/m. 

The special case, where fc = 1, is the well-known lin- 
ear congruential generator LCG(a; c; m) introduced by 
Lehmer or in the homogeneous case, c — 0, the 

multiplicative linear congruential generator, denoted by 
MLCG(a;m). It can be shown that a recursion of or- 
der k with a non-zero constant c is equivalent to some 
homogeneous recurrence of order fc -f- 1 ||5|,p8|]. All con- 
gruential generators show a pronounced lattice structure. 
That means, if n subsequent numbers are used to form 
vectors in the n-dimensional space all points that can be 
generated within the period lie on a family of equidistant 
parallel hyperplanes ||] . Tables with good choices for the 
constants can be found in recent reviews ||,|2^,^,^ . 

A lagged Fibonacci generator, LF(^i, . . . ,lk]rn; o), with 
fc lags is obtained for c = and fc coefficients ai being set 
to unit modulus, the others being set to zero, 



Xn = {xn^h ° ■ ■■ ° Xn-it, ) mod m. 



(7) 



The binary operator o is usually either addition or sub- 
traction. 

The Linear feedback shift register or Tausworthe 
method, LFSR(p, q) , generates a sequence of binary dig- 
its (bits) b„ from the recurrence relation 



where the exclusive-or operation © is equivalent to a bit- 
wise addition modulo two [||J3^. A sequence of pseu- 
dorandom numbers is then obtained by taking an ap- 
propriate number of consecutive bits to form an integer 
number. 

Generalized feedback shift register generators , de- 
noted by GFSR(/i, . . . ,lk;rn), which can be considered as 
a generalization of the Tausworthe generator, are related 
to the lagged Fibonacci method, but use the exclusive-or 
operation instead of the arithmetic operators to combine 
computer words w 



w„ 



® ■ ■ ■ © Wn-u 



(9) 



A generator of this type with two lags (103 and 250) has 
been made popular by Kirkpatrick and Stoll and is known 
as R250 p5| , |36| | (see also ||]). A particular realization 
with four lags has been given by Ziff ||3^ (for test results 
see p^-|2l|). A recently proposed special variant with 
huge period is the twisted GFSR generator, TGFSR jT^- 
The multiply-with- carry generator, denoted 
by MWC(ai, . . . , 0^; c; m), is defined by the recurrence 
relation 



Xn = {aiXn-l + 
C„ = {aiXn~l + 



+ OkXn-k + Cn-i)modm, (10) 
+ OkXn-k + c„_i) divm. 



The div denotes an integer division. Here, in contrast 
to the MRG a carry (or borrow) c„ is propagated to the 
next iteration step. 

Special cases of the MWC are the the add-with- 
carry, AWC(?i, ^2; Ti), and the subtract-with-borrow, 
SWB(Zi, ^2; rn), generators, which are obtained by setting 
two coefficients to unit modulus and all others equal 
to zero | p^ , p8[ . This basically results in a LF generator 
with two lags, but with an extra addition of a carry 



Xn = [xn^h + Xn~i2 +c„_i)modm, 

Cn = [Xn-h + Xn-l2 + C„-l > m] . 



(11) 



In the case of an AWC the bracket indicates the value 
of the carry which is equal to 1 if the inequality is true, 
and equal to otherwise. In the case of an SWB the 
addition operations accordingly have to be replaced by 
subtractions and the borrow is equal to 1 if the result 
of the subtractions becomes negative. These generators 
can produce much longer periods than the underlying LF 
generators, but have a bad lattice structure in dimension 
I + 1, {I being the larger of the lags) ||,||,^. 

The subtraction method, SUB(c; m), is based on a sim- 
ple arithmetic sequence 



{xn-i — c) mod m. 



(12) 



bn — bn—p © bn- 



(8) 



This method is not suitable by itself, but it may be in- 
cluded in combination generators . 

The multiplicative quadratic congruential method, 
MQC the cryptographic BBS Q and DES ^ gen- 
erators, or the inversive congruential generator, ICG 
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are only mentioned for completeness, as these have re- 
ceived considerable theoretical attention recently. These 
new methods have promising features, but the genera- 
tors are currently not in common use as there is little 
practical experience with them. 

In general the PRNGs with several lags require an ini- 
tial set of seeds xi, . . . , Xfe the number of which is deter- 
mined by the largest lag k. While most generators do 
not require a special initialization procedure care has to 
be taken with the GFSR generators. Here an improper 
selection of the seeds can severely affect the quality of the 
sequence of PRNs . Often a congruential generator is 
used to generate the initial state. 

Tausworthe and LFSR generators which are based 
on the theory of primitive trinomials form unfavourable 
structures similar to the lattice structure of LCGs and 
have bad statistical properties . Such simple gener- 

ators should be avoided and combined generators should 
be used instead. 

There is strong empirical support that the combination 
of different pseudorandom sequences in general leads to 
an improved statistical behavior Q,^. The two well- 
known methods are the shuffling of one sequence with 
another or with itself [^j8| or the combination by modu- 
lar addition |^,^ . Hybrid generators based on the first 
method are still not well understood from the theoretical 
viewpoint |^J^. The latter method is better understood 
and is suited to obtain very long periods. Adding two se- 
quences modulo the modulus of either of them the period 
obtained is the least common multiple of the component 
periods. Generators based on such combination methods 
currently provide us with the "best" PRNs. Many differ- 
ent kinds of combined generators have been proposed, see 
Refs. |,|,@,|l|-|6|j2|J^,|§ and references given therein. 

Another common method which can lead to an im- 
provement of a generator is a decimation strategy, that 
means a number of PRNs is thrown away before the next 
random number is delivered. This approach is taken 
for instance in the RANLUX generator [ 46|j47| ] which sig- 
nificantly improves the defective SWB generator RCARRY 
[Qjs^]. But neither shuffling nor the decimation method 
may be desirable if speed considerations are very impor- 
tant (see Appendix y| for timing results) . 

In the following the generators subjected to the RS 
statistical analysis are described in brief. 



B. Tested Generators 

Gl is the well-known MLCG(7^ 231-1), which has been 
proposed as the "mimimal standard' against which 
all other generators should be judged p7| , ^,H. I t 
is also known as GGL ||||], CONG ranO [[l2|]49[, 
SURAND (IBM computers), RNUM (IMSL hbrary), or 
RAND (MATLAB software). It has the serious draw- 
back of a short period, 2^^ — 1, and a pronounced 
lattice structure in low dimensions. Multiplier and 



modulus are not the optimal choice considering sev- 
eral figures of merit, see for instance This gen- 
erator should only be considered as a toy for ex- 
perimenting with new test methods like all other 
simple congruential and LFSR generators. 

G2 is identical to Gl, but additionally Bays-Durham 
shuffling in a table of size 32 is used to improve 
the low-order serial correlations. Here the imple- 
mentation rani of Ref. [[l2|j4g| ] has been applied. 
It is included in this test to show the influence of 
shuffling on the RS statistic. 

G3 is a LF(55, 24; 2^^; — ) generator which has a period 
of 2^^ — 1. It has been devised by Mitchell and 
Moore in 1958 and is described by Knuth Q (orig- 
inally using an add operation). Th is g enerator (a 
version of which is implemented in [^2| as ran3) is 
reported to have significant correlations on the bit- 
level and to fail several physical tests ||ll|,^8|-^ . It 
is included to demonstrate the effect of short range 
correlations on the RS statistic. 

G4 is a modification of the above generator G3. If a dec- 
imation strategy is used, that means, if only every 
fc-th number of the sequence is used, the generator 
passes all of the physical tests in Ref. [|l8|-p0| (for 
k ~ 2 and fc = 3). In this work only the case of 
fc = 3 is considered. 

G5 ist the GFSR(250,103,232) generator R250 proposed 
by Kirkpatrick and StoU [ p5p^ ]. It has a period 
of 2^^". While this generator performs well in the 
standard statistical tests it is reported to fail sev- 
eral physical tests ||Jl^-^ . 

G6 The combination generator RANMAR proposed by 
Marsaglia and Zaman 0,^^ has a period of about 
2^^"^. It is based on the subtraction modulo 2^^ of 
a simple arithmetic sequence 

SUB(7654321;224-3) 
and a subtractive Fibonacci generator 

LF(97, 33; 224;-) 
The initial state is generated by another combina- 
tion of LCG(53; 1; 169) and a multiplicative three- 
lag Fibonacci sequence. The implementation of 
James 0| tested here is in wide-spread use and has 
been recommended as a "universal generator" . 

G7 combines the two congruential sequences 
MLCG(40014;23i -85) 
and 

MLCG(40692;23i - 249) 
by modular addition and applies an additional shuf- 
fling in a table of 32 entries. The period is approx- 
imately 2^2. This algorithm has been invented by 
L'Ecuyer and implemented by James |Q (called 
RANECU). The additional shuffling has been added 
in the version raji2 of Press et al. Many 
recommendation for the improvementffor instance 
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of the speed) of the later version have been given 
by MarsagUa and Zaman (l4| . They reported that 
this generator passes all standard tests. Because 
of its popularity the implementation of Rcf. [^ , ^ 
has been used in the following RS analysis. 

G8 is the recently proposed PRNG mzranlS of Marsaglia 
and Zaman. It combines 

LCG(69069, 1013904243; 2^2) 
and 

SWB(2,3;232 _ 18) 
by modular addition and has a period of about 2^^^ 
||l4| . Although the published program takes advan- 
tage of the inherent modulo 2^^ arithmetic of mod- 
ern CPU's it can easily be made portable to CPUs 
with any larger word size by using bit-masks. 

G9 This is a composite generator of L'Ecuyer |l^ based 
on the modular addition of the sequences of 

MRG(0, 63308, -183326; 0; 2^^ - 1) 
and 

MRG(86098, 0, -539608; 0; 2^1 - 2000169). 
It has a very long period of about 2^'"^ and a lattice 
structure with theoretically better properties than 
G7 |l|. 

GIO This generator is the maximally equidistributed 
three-component Tausworthe generator taus88 de- 
veloped by L'Ecuyer with a period of approxi- 
mately 2®^. 

GU The twisted GFSR generator TT800 proposed by 
Matsumoto and Kurita |l^ has a huge period of 
2800 _ ^ g^j^^ jg reported to have excellent equidis- 
tribution properties up to a dimension of 25. This 
generator is recommended in The tested ver- 
sion includes Matsumoto's code change of 1996 
which improves the lower bit correlations. 



IV. DESCRIPTION OF THE TEST AND 
RESULTS 

A. The Test Setup 

A few additional words have to be said about the gen- 
eration of the initial seeds for the PRNGs. As these are 
(possibly) the only truly random part when generating 
pseudorandom numbers some care should be taken. 

The following method has been applied, as it corre- 
sponds to a common way random generators are used in 
practice: 

The initial seed is calculated from a combination of 
some obviously truly random events, such as the time 
and the date when the program is started, several system 
specific (unique) process identifiers, and the processor 
clock state. For this initial seed a sequence of 10^ to 10^° 
random numbers is generated and analyzed according to 



(|l|). Then for some new random seed another sequence 
is obtained and analyzed. 

This procedure has been iterated until the statisti- 
cal error for the average of RS{t) was considered small 
enough. For each of the generators this amounted to 10^^ 
to 10^2 generated PRNs. 

As this approach does not ensure that the generated 
substreams are disjoint it might look safer to split the 
period into disjoint parts. This could be done for al- 
most all generators, but there are several cases known 
where these (typically) equidistantly spaced seeds intro- 
duce even worse correlations . One should also bear in 
mind that for the long period generators there is only a 
very small probability that, for instance, ten or twenty 
sequences of 10^*^ numbers selected by a random seed are 
not disjoint (of course the period of the "toy" generators 
is exhausted immediately). 

In the case of generators requiring more than one seed 
one initial seed has been generated and mixed into the 
default seeds of the original source code. For instance, 
the 25 published seeds that define the state of the TGFSR 
generator Gil have all been exclusive-or-ed with a new 
random seed every time a new sequence has been gener- 
ated. 

All calculations necessary to evaluate the RS statistic 
have been performed in double precision using IEEE 754 
standard floating point arithmetics. 

The number of PRNs generated in the test of each gen- 
erator is comparable to the number of random variates 
typically required in a nowadays high precision Monte- 
Carlo simulation. Such a number may seem large for a 
mere test, but it comprises the current state of the art in 
research fields like percolation, random walks, diffusion 
limited aggregation, and many others Consid- 
ering the speed of the advances in computer technology 
much larger simulations will be in reach within the next 
few years posing increased demands on precision to the 
PRNGs. Correspondingly the stringency of the empirical 
tests has to increase too. 

In the following section it will be shown that several 
current thought-to-be-reliable PRNGs show pronounced 
correlations in the RS statistic. This does not mean that 
a large scale simulation inevitably produces erroneous re- 
sults with such a PRNG, but it just means that in some 
types of simulations deviations are not unlikely if high 
precision is required. Moreover, the main purpose of this 
paper is to demonstrate that the RS statistic is a candi- 
date to enrich the toolbox of empirical tests for random 
number generators. 



B. Analysis of the RS Data 

In Fig.|l|the dia gram of log RS{t) versus log r is shown 
for all tested random generators. RS{t) has been calcu- 
lated for all powers of two in the range from r = 2 up 
to T = 2^^ « 8 X 10^ as indicated by the dots. After 
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0.0005 



FIG. 1. Double-logarithmic plot of the numerical data (•) 
of RS{t) for all PRNGs. On this scale the results for the var- 



ious PRNGs are indistinguishable. The asymptotic T7r/2 
behavior is indicated by the broken line. 




FIG. 2. Semi-logarithmic plot of RS{T){nT /2)-^/^ - 1 for 
the pseudorandom number generators Gl (o) and G9 (□). 
The lines are intended as a guide to the eye. 



■~r 0.0002 




FIG. 3. 7^(r) versus r for Gl (o) and G2 (*) illustrating 
the effect of a shuffle table. The inset shows a larger range of 




FIG. 4. Upper figure: TZ{t) versus r for the LF generator 
G3 (A). Lower figure: Drastic deviations from the asymptotic 
value (dotted line) are also visible in ARS{t). 



a transient behavior for lags smaller than r « 10^ the 
asymptotic law (^) applies almost perfectly. But on this 
scale the results for the various PRNGs are indistinguish- 
able for all lags. 

To resolve differences between the PRGNs it is con- 
venient to remove the asymptotic trend. In Fig. ^ the 
reduced function RS{t){ttt/2)~^/^ — 1 is displayed for a 
generator with known correlations, Gl (o), and the com- 
bination generator G9 (□). On this scale of magnification 
it can be seen that the simple LCG spectacularly fails to 
approach the expected asymptotic. The relative devia- 
tion becomes as large as 1% corresponding to a reduced 
asymptotic prefactor (which appears to be approximately 
1.243 instead of •\/7r/2 = 1.253). For comparison the data 
for the highly reliable composite MRG G9 are shown. In 
this case the asymptotic expectation value is approached 
smoothly. Due to the large statistical ensemble the error 



bars appear as single lines. 

The distribution of the numerical RS values for all lags 
is well described by the slightly right-skewed asymptotic 
density as given by Feller |^ . The half width of the error 
bars for the estimate of the mean (in this and the follow- 
ing figures) is given by two standard deviations according 
to the asymptotic analytical result (^) . This corresponds 
to a confidence level of about 95%. The numerical results 
for the mean together with the standard deviation of the 
mean are tabulated in Appendix ^ for all generators of 
this test. 

As with several other test statistics where only the 
asymptotic distribution is available one is limited to com- 
pare the generators. Comparing the estimate of the mean 
for finite lags with the asymptotic expectation one could 
always enforce a rejection of a generator if the the num- 
ber of samples is sufficiently increased. In the following a 
method is described which facilitates the comparison of 
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FIG. 5. 7^(r) for the generator G4 (O). Inset: Magnified 
view for small lags r. 
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FIG. 6. 7^(r) for the GFSR generator G5 (+). Inset: Mag- 
nified view for small lags r. 
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FIG. 7. TZij) for the combination generator G6 (x). Inset: 
Magnified view for small lags r. 
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FIG. 8. 7^(t) for the combination generator G7 (□). Inset: 
Magnified view for small lags r. 



RS{t) for the different generators. 

It can be safely assume that the asymptotic limit is 
approached smoothly with increasing r. Therefore any 
apparent local and non-monotone structure in the tran- 
sient will be indicative of correlations. Analyzing the 
functional form of the transient a simple and smooth in- 
terpolation can be found which gives an accurate approx- 
imation for all lags within a range of more than 6 orders of 
magnitude. The transient of RS{t) can be parametrized 

by 



7^(r) 



RS{t) 



1 



1 



+76 



arctan (3t tt 



(13) 



Using only two parameters a, (3 the first two terms suffice 
to approximate the transient with a relative precision of 
« 10"^ for all lags larger than r = 32. The last term 
in has been introduced to approximate the transient 
for lags as small as r = 4. The coefficients have been 
obtained from a numerical adjustment using the mean 
values obtained from the stronger generators G8, G9, and 



GIO with T in the range from 4 to 2^^^. In this range the 
individual results agree to a high precision. The values 
of the coefficients in (nsl) used in the following are 



a fti 1.0319941 7 
/3 w 0.42091184 5 



! 0.10516938 
0.90187633 



0.61775533 



(14) 



The smooth interpolation 7^(r) of the transient now al- 
lows an unbiased comparison of the various PRNGs. As 
the expectation values for finite r are not known the ap- 
proximation (13),(p^) is used instead. The generators 
can now be compared with the approximate transient. 
This approach has been found to be superior to compar- 
ing the generators individually. In particular the influ- 
ence of statistical fluctuations of the mean are minimized 
compared to a pairwise comparison of the generators at 
a given value of r. In the following it will become clear 
that the important point is not to have a precise approx- 
imation of the transient for truly random numbers. The 
detection of a deviation is insensitive to the exact form of 
the approximation: in all cases a defect showed up as a 
pronounced wiggle in RS(t) around the monotone tran- 
sient. Therefore the subtraction of any monotone and 
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FIG. 9. 71{t) for the combination generator G8 (<l). Inset: 
Magnified view for small lags r. 




FIG. 11. TZ{t) for the combination generator GIO (>). In- 
set: Magnified view for small lags r. 
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FIG. 10. 7?.(r) for the combination generator G9 (□). In- 
set: Magnified view for small lags r. 
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T 

FIG. 12. 7^(r) for the TGFSR generator Gil (v)- Inset: 
Magnified view for small lags t. 



slowly varying function would suffice to reveal a char- 
acteristic "fingerprint" of correlations in the PRNG. All 
systematic deviations of TZ{t) from zero are indicative of 
the presence of correlations and the amplitude at lag r 
can be considered as a measure of the strength of corre- 
lations for the given lag. Hence the various PRNGs can 
be compared quantitatively. 

C. Discussion of the Results 

In Fig. I the semi-logarithmic plots of TZ{t) versus log r 
for the toy generators Gl (o) and G2 (*) are shown for 
lags between 4 and 2^^ « 2 x 10^ (inset). Serious devi- 
ations are evident for lags larger than 10"^. Magnifying 
on the vertical axis by a factor of 25 the plot of TI{t) 
reveals deviations also at small lags (main figure). In 
generator G2 additional shuffling in a small table has 
been introduced to improve low order serial correlations 
of generator Gl. For lags up to r w 128 the deviations 
are indeed strongly reduced. As expected there is no im- 
provement for lags which are much larger than the size 
of the shuffling table. 



In Fig. y the results for the lagged Fibonacci genera- 
tor G3 (A) are shown. This generator is known to fail 
several tests (see Ref . |p^-^ and appendix ^ . It is re- 
assuring to see that the RS statistic easily reveals the 
onset of disastrous correlations at t corresponding to the 
larger lag of the generator (/ = 55). The deviations show 
up as a crossover of TZ{t) (upper figure) to a "shifted 
asymptotic" reflecting a modified asymptotic prefactor. 
This gives evidence to the presence of some strong cyclic 
components in the pseudorandom process of G3. This is 
the only generator in this test showing also deviations of 
ARS{t) from the asymptotic value (Fig. |4| lower graph). 

If a decimation strategy with /c = 3 is applied, cor- 
responding to generator G4 (O), the correlations are 
strongly suppressed (Fig. ||). 

The GFSR generator G5 (Fig. ||) uses larger lags than 
G3 shifting the onset of correlations to larger r. The 
magnitude of the deviation is even twice as large as that 
of generator G3. These dramatic deviations are obviously 
indicators for the poor behaviour of G5 in some Monte- 
Carlo (MC) simulations [Q. 

Pseudorandom numbers of much better quality are ex- 
pected from combination generators which can overcome 
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the weakness of generators which are structurally too 
simple. 

In Fig. 1^ the performance of the popular combination 
generator G6 (+) can be estimated. When t is somewhat 
larger than the lags of the LF component of the generator 
significant deviations in TZ are observed (similar to G3 
and G5). These are presumably due to the deficient LF 
component of the composite generator. But compared 
to G5 the deviation is about 10 times smaller. For the 
time being there are no documented failures in physical 
simulations that use this generator But comparing 
Fig. with Figs. |,| one can conclude that deviations in 
MC simulations are not implausible if higher precision is 
demanded. 

PRNGs which are as fast, but which have better long- 
range properties are discussed in the following. In the 
next figure, Fig. ^, the results for the combined congru- 
ential generator G7 (x) are shown. Compared to the 
previous generators the amplitude of the deviations is 
drastically decreased. But for lags in the range r = 2^^ to 
2^ a structure being indicative of correlations can be re- 
solved (see inset of figure and Tab. ^) on a high level of 
significance. Although G7 is doubtlessly one of the better 
generators within this test it should be immediately evi- 
dent that it cannot come up to the expectations of Press 
and Teukolsky [|l:2|j49[ to provide perfect random numbers 
(within the limits of its floating point precision). Thus 
their proposed "practical" definition of perfect should at 
least be put into perspective. 

Random numbers of much better quality (at least in 
the RS statistics) are generated by the recently proposed 
composite generators G8 to Gil. For all lags in the range 
2^ to 2^^ there are no significant differences. These four 
PRNGs are based on four different generation methods. 
Generator G8 applies a combination of generators with 
different algebraic structure while the two-component 
MRG G9 and the three-component Tausworthe generator 
GIO combine generators of the same class. Finally, Gil 
is a TGFSR generator which distinguishes itself by an 
extraordinary long period [5T| . The fact that four gener- 
ators of completely different algebraic structure and with 
theoretically favourable properties give consistent results 
reassures that the observed deviations of the other gen- 
erators are indicators of real defects. 

It should be noted that RS{t) necessarily has been 
sampled on a coarse grid on the logarithmic scale. There- 
fore it is possible that several types of correlations which 
would have shown up as a narrow structure have not been 
recognized. Nevertheless the observed deviations are in- 
triguing. 



tical method has been described which makes it easy to 
obtain a characteristic fingerprint of the correlations in 
a pseudorandom sequence. The deviations can be de- 
scribed quantitatively and the performance of generators 
for some given range of lags can be compared. 

To illustrate the capability of the RS statistical anal- 
ysis several popular generators have been subjected to 
an extensive test. The randomness of all tested PRNGs 
whith known defects could be refuted. Moreover devia- 
tions in several generators which are thought to be re- 
liable have been quantified. Thus the RS analysis has 
to be considered more stringent than many of the previ- 
ously suggested tests in the sense that more generators 
fail it. 

The selection of a PRNG for a specific simulation de- 
pends on the required level of precision and on the range 
of the correlations which may have an impact on the 
quantity of interest ~ although this often cannot be as- 
sessed in advance. But no generator showing a per- 
formance inferior to another generator in several tests 
should be used any longer if it doesn't even distinguish 
itself at least by speed. Weak correlations in a current 
state-of-the-art generator (like some of this test) can lead 
to erroneous results in a tomorrow high-precision calcu- 
lation. 



T 


Gl 






G2 






G3 




2' 


8.836(3.84)10 




2 


1.572(3.84)10" 


b 




-1.228(6.61)10" 


b 


2^ 


2.331(0.45)10 




5 


-1.166(4.50)10" 


6 




3.183(7.76)10" 


6 


2-* 


3.075(0.57)10 




5 


2.972(5.68)10" 


6 




1.010(9.79)10" 


6 


2^ 


3.969(0.75)10 






1.045(0.75)10" 


5 




-1.026(1.29)10" 


5 


2'5 


6.990(1.01)10 




6 


1.348(1.01)10" 


5 




-3.324(0.17)10" 


-4 


2^ 


4.659(1.38)10 




3 


1.593(13.8)10" 


6 




2.483(0.24)10" 




2^^ 


-2.911(19.0)10" 


-6 




-6.774(19.0)10" 


6 




-9.531(0.33)10" 




2« 


-2.170(2.65)10" 


-5 




-3.392(2.65)10" 


5 




-9.763(0.35)10" 




2l0 


6.632(3.71)10" 


-5 




4.517(3.71)10" 


5 




-9.344(0.49)10" 




2" 


7.057(0.52)10 


-4 


13 


7.230(0.52)10" 


4 


13 


-7.032(0.69)10" 




2l2 


1.249(0.07)10 


^ 


17 


1.192(0.07)10" 


■3 


16 


-5.987(0.98)10" 




2" 


-8.803(9.78)10" 


-5 




-4.659(0.98)10" 




4 


-5.266(0.82)10" 




2" 


-2.627(0.13)10 


-a 


20 


-2.637(0.13)10" 


-s 


20 


-4.612(1.08)10" 




2l5 


-3.263(0.31)10 




10 


-3.378(0.31)10" 


-s 


11 


-2.177(1.53)10" 


4 


2l6 


-4.948(0.43)10 




11 


-5.969(0.43)10" 


-s 


13 


2.702(21.5)10" 


5 


2i^ 


-5.529(0.61)10 




9 


-6.934(0.61)10" 




11 


3.151(30.4)10" 


5 


2l8 


-7.006(0.86)10 




B 


-5.760(0.86)10" 




6 


7.069(4.30)10" 


4 


2l9 


-9.363(1.22)10 




7 


-8.212(1.22)10" 


-s 


6 


1.232(6.08)10" 


4 



TABLE I. The numerical values of '7J(t) are tabulated in 
columns for the generators Gl, G2, G3. The value of one 
standard deviation (cr) of the mean is given in parenthesis. If 
the deviation is larger than 2a the value is framed and the 
deviation in units of o is attached to the right. 



APPENDIX A: NUMERICAL RESULTS 



V. CONCLUSIONS 

The sensitivity for correlations on all scales and the ro- 
bustness predestinates the RS statistic as a tool to catch 
up defects in pseudorandom number generators. A prac- 



The numerical results for the mean of "^(t"), as depicted 
in previous figures, are reported in tables | to The 
value of one standard deviation of the mean is given in 
parenthesis. Values which differ from zero by more than 
two standard deviations are framed and the deviation in 
units of standard deviations is printed behind the box. 
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TABLE II. The numerical values of 7?.(r) are tabulated in 
columns for the generators G4, G5, G6. See Tab. | for an 
explanation. 
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TABLE IV. The numerical values of TZ(t) are tabulated 
in columns for the generators GIO, Gil. See Tab. ^ for an 
explanation. 
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TABLE III. The 


numerical values of TZ{t) 


are tabulated 


in 



explanation. 



APPENDIX B: TIMING RESULTS 

In table ^ the typical execution times relative to the 
generator Gl are given. All generators have been config- 
ured to deliver one PRN per function call and no function 
code has been inlined. Although the figures may scatter 
between different architectures, compilers and optimiza- 
tion options they should be indicative for the relative 
performance on work station type computers. It should 
be mentioned that in the case of combined MLCGs and 
combined MRGs (G7,G9) a floating point implementa- 
tion is often much faster than an integer implementation 
on many modern CPUs. These versions can compete 
with the fastest generators of table 0. |Q 



PRNG 


rel. time 


PRNG 


rel. time 


Gl 
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G7 
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G8 
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TABLE V. Relative execution times of the generators con- 
sidered in this test. 



APPENDIX C: ADDITIONAL RESULTS 

For comparison the performance of the generators Gl- 
Gll in the recently proposed n-hlock test and the ran- 
dom walk test ||l^-|20| has been calculated. For the group 
of PRNGs which have already been considered in Ref. 
[^-20 1 the results were reproduced. The figures for all 
generators tested newly are reported in Tab. VI. Accord- 
ing to Ref. p^|20|| the limit of acceptance in the x^"test 
has been chosen < 7.815 in the case of the random 
walk test and < 3.841 for the n-hlock test. A genera- 
tor is assumed to pass the test if in at least two of three 
independent runs the value of is below the given limit. 

The only PRNGs which shows significant deviations 
from the expected distributions are generators G3 and 
G5. If the decimation strategy is used then G3 also passes 
these tests (corresponds to G4). 

These results have to be contrasted with the perfor- 
mance of the PRNGs in the RS statistical analysis which 
is much more stringent in the sense that more generators 
fail it. 

From the presented figures it is obvious that the walk 
length (block size) in these tests is too small (by orders of 
magnitude) to catch the severe defects at lags that cor- 
respond to the large walk lengths in realistic simulations. 
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It is also evident that it is not sufficient to consider only 
a fixed lag as the amplitude of the deviations can vary 
strongly with the lag. Finally the RS statistic appears 
to be superior considering its sensitivity for correlations. 
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TABLE VI. Results for three runs of the random walk test 
(walk length A'^ = 750 using 10® samples) and of the n-block 
test (block size = 500 using 3 x lO'* samples) ||l|-|§. The 
framed figures indicate a failure in this test. 
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